XenC: An Open-Source Tool for Data Selection in Natural Language Processing
Similar resources
XenC: An Open-Source Tool for Data Selection in Natural Language Processing
In this paper we describe XenC, an open-source tool for data selection aimed at Natural Language Processing (NLP) in general and Statistical Machine Translation (SMT) or Automatic Speech Recognition (ASR) in particular. Usually, when building an SMT or ASR system, the task at hand is tied to a specific application domain, such as news articles or scientific talks. The goal of ...
Treex - an open-source framework for natural language processing
The present paper describes Treex (formerly TectoMT), a multi-purpose open-source framework for developing Natural Language Processing applications. It facilitates the development by exploiting a wide range of software modules already integrated in Treex, such as tools for sentence segmentation, tokenization, morphological analysis, part-of-speech tagging, shallow and deep syntax parsing, named...
VarClass: An Open-source Language Identification Tool for Language Varieties
This paper presents VarClass, an open-source tool for language identification available both for download and through a user-friendly graphical interface. The main difference between VarClass and other state-of-the-art language identification tools is its focus on language varieties. General-purpose language identification tools do not take language varieties into account and...
MeshLab: an Open-Source Mesh Processing Tool
The paper presents MeshLab, an open-source, extensible mesh processing system that has been developed at the Visual Computing Lab of the ISTI-CNR with the help of dozens of students. We describe the MeshLab architecture, its main features, and its design objectives, and discuss the strategies used to support its development. Various examples of the practical uses of MeshLab in research...
Development of an Open Source Natural Language Generation Tool for Finnish
We present an open-source Python library that automatically produces syntactically correct Finnish sentences when only lemmas and their relations are provided. The tool automatically resolves morphosyntax in the sentence, such as agreement and government rules, and uses Omorfi to produce the correct morphological forms. In this paper, we discuss how case government can be learned automatically from ...
Journal
Journal title: The Prague Bulletin of Mathematical Linguistics
Year: 2013
ISSN: 1804-0462,0032-6585
DOI: 10.2478/pralin-2013-0013